PLOS Digital Health
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Achieving interoperability in machine learning (ML) workflows remains a significant challenge due to the heterogeneity of data types, algorithms, and application domains, as well as the lack of standardized metadata. In this study, we present the development of a specialized ML metadata schema within the context of the Small Data Initiative, a Collaborative Research Center characterized by diverse scientific approaches. We employed an interdisciplinary process combining expert input, iterative r...
Background: Large language models (LLMs) have demonstrated rapid advancements in natural language understanding and generation, prompting their integration into biomedical research, clinical practice, and professional education. However, systematic evaluation of LLMs in specialty-specific domains such as dentistry and periodontology remains limited, particularly regarding multidimensional performance metrics. Objective: To conduct a comprehensive, multidimensional assessment of commercially availabl...
Various healthcare applications based on large language models (LLMs) have emerged as LLMs show improved efficiency and error reduction. Recently, retrieval-augmented generation (RAG) has been adopted frequently for LLM applications to address the problem of hallucinations. Despite its success, RAG has drawbacks, including incomplete semantic meanings and large-scale dataset requirements. AI agents have shown great potential in medicine and healthcare applications by leveraging their ri...
This study explores clinician understanding and perception, at site-lead level, of machine learning (ML) decision support tools for paediatric emergency care across the UK and Ireland, which is essential in guiding safe and effective frontline implementation. A cross-sectional online survey was distributed via Paediatric Emergency Research United Kingdom and Ireland (PERUKI) to the lead for digital systems or PERUKI site lead, with one response sought per site. Survey development was in REDCap...
The SingHealth Duke-NUS Academic Medical Center manages over 2,800 clinical faculty members and processes over 400 appointments and promotions annually. The current Promotion and Tenure documentation includes over 30 documents, making it difficult and time-consuming for the faculty to locate specific appointment information. We developed "AskADD" in response to requests for clearer academic career development guidance. This study reports initial alpha testing and subsequent beta testing with 35 ...
This paper re-imagines a world of abundance in the treatment of chronic diseases such as Type 2 Diabetes. It asks: what if preventive and diagnostic remedies were made widely available across the world, informed by the latest medical research? As a proof of concept of a proposed solution, the paper describes the development and validation of a local Large Language Model (local-LLM) based on Graph-based Retrieval-Augmented Generation (GraphRAG) for managing Gestational Diabetes Mellitus (GDM). The...
Background: Using artificial intelligence (AI) to help clinical diagnoses has been an active research topic for more than six decades. Past research, however, has not had the scale and accuracy for use in clinical decision making. The power of AI in large language model (LLM)-related technologies may be changing this. In this study, we evaluated the performance and interpretability of Generative Pre-trained Transformer 4 Vision (GPT-4V), a multimodal LLM, on medical licensing examination questions...
Introduction: Traditional deep learning models for lung sound analysis require large, labeled datasets; multimodal LLMs may offer a flexible, prompt-based alternative. This study aimed to evaluate the utility of a general-purpose multimodal LLM, GPT-4o, for lung sound classification from mel-spectrograms and assess whether a few-shot prompt approach improves performance over zero-shot prompting. Methods: Using the ICBHI 2017 Respiratory Sound Database, 6898 annotated respiratory cycles were convert...
There are 2.9 million annual neonatal deaths worldwide. Simple, evidence-based interventions such as temperature control could prevent approximately two-thirds of these deaths. However, key problems in implementing these interventions are a lack of newborn-trained healthcare workers and a lack of data collection systems. NeoTree is a digital platform aiming to improve newborn care in low-resource settings through real-time data capture and feedback alongside education and data linkage. This proj...
Artificial intelligence (AI) in healthcare holds transformative potential but risks exacerbating existing health disparities if inclusivity is not explicitly accounted for. This study addresses the disconnected discussions on inclusive medical AI by developing a comprehensive framework, PREFER-IT. This framework is based on the outcomes of a five-day transdisciplinary co-creation workshop that involved 37 experts from diverse backgrounds, including healthcare, ethics, law, social sciences, AI, a...
Skin lesion prediction using artificial intelligence (AI) models is highly dependent on skin tone, yet current approaches largely overlook this critical factor. The Fitzpatrick 17k dataset, which contains six skin tone categories ranging from lighter to darker, is severely imbalanced, with most models biased toward lighter skin tones. Previous efforts to improve overall accuracy fall short because overall accuracy fails to reflect true performance across imbalances. This creates a significant gap, as effective ski...
OBJECTIVE: To analyze the frequency of self-disclosed use of AI in research manuscripts submitted to 49 biomedical journals and to identify types of AI tools used, the tasks they assisted with, and factors associated with disclosure. DESIGN: Cross-sectional study. SETTING: 49 biomedical journals published by BMJ Group. PARTICIPANTS: Submitting authors of 25,114 empirical research manuscripts including systematic reviews and meta-analyses, submitted between 8 April 2024 and 6 November 2024. MAIN OUTC...
Purpose: Empathy, a cornerstone of human interaction, is a quality unique to humans that Large Language Models (LLMs) are believed to lack. Our study aims to review the literature on the capacity of LLMs to demonstrate empathy. Methods: We conducted a literature search on MEDLINE up to July 2023. Seven publications ultimately met the inclusion criteria. Results: All studies included in this review were published in 2023. All studies but one focused on ChatGPT-3.5 by OpenAI. Only one study evaluated...
Background: Artificial Intelligence (AI) has evolved through various trends, with different subfields gaining prominence over time. Currently, Conversational Artificial Intelligence (CAI)--particularly Generative AI--is at the forefront. CAI models are primarily focused on text-based tasks and are commonly deployed as chatbots. Recent advancements by OpenAI have enabled the integration of external, independently developed models, allowing chatbots to perform specialized, task-oriented functions be...
Background: One of the most difficult challenges in pediatric telemedicine is to accurately discriminate between the sick and not sick child, especially in resource-limited settings. Models that flag potentially sick cases for additional safety checks represent an opportunity for telemedicine to reach its potential. However, there are critical knowledge gaps on how to develop such models and integrate them into electronic clinical decision support (eCDS) tools. Methods: To address this challenge...
Background: Chatbots have the potential to reduce barriers to pre-exposure prophylaxis (PrEP), including lack of awareness, misconceptions, and stigma, by providing anonymous and continuous support. However, in the context of PrEP, chatbots are still nascent; they lack personalized informational expertise, peer experiential expertise, and human-like emotional support to promote PrEP uptake and retention. Tailoring information, providing relatable peer experiences, and offering effective emotional s...
Recently developed chatbots based on large language models (hereafter called bots) have promising features that could facilitate medical education. Several bots are freely available, but their proficiency has been insufficiently evaluated. In this study the authors have tested the current performance on the multiple-choice medical licensing exam of the University of Antwerp (Belgium) of six widely used bots: ChatGPT (OpenAI), Bard (Google), New Bing (Microsoft), Claude instant (Anthropic), Claude+ (A...
Wearable activity trackers have been recognised as effective tools for physical activity promotion, leading to their integration in healthcare services. However, some qualitative literature has indicated that device users may experience emotional conflict. The current study is the first, to our knowledge, to directly examine the conflict faced by wearable activity tracker users. A qualitative, exploratory design was followed, with inductive thematic analysis conducted on semi-structured interview tr...
Objective: The United States Medical Licensing Examination (USMLE) assesses physicians' competency, and passing is a requirement to practice medicine in the U.S. With the emergence of large language models (LLMs) like ChatGPT and GPT-4, understanding their performance on these exams illuminates their potential in medical education and healthcare. Materials and Methods: A literature search following the 2020 PRISMA guidelines was conducted, focusing on studies using official US...
The paper introduces RAGCare-QA, an extensive dataset of 420 theoretical medical knowledge questions for assessing Retrieval-Augmented Generation (RAG) pipelines in medical education and evaluation settings. The dataset includes one-choice-only questions from six medical specialties (Cardiology, Endocrinology, Gastroenterology, Family Medicine, Oncology, and Neurology) with three levels of complexity (Basic, Intermediate, and Advanced). Each question is accompanied by the best fit of RAG impleme...